Hardware-Assisted Rootkits: Abusing Performance Counters on the ARM and x86 Architectures
Author
Abstract
This paper introduces a novel hardware-assisted rootkit that leverages the performance monitoring unit (PMU) of a CPU. By configuring hardware performance counters to count specific architectural events, the research shows that system calls and other interrupts can be trapped transparently, driven entirely by the PMU. This lets an attacker redirect control flow to malicious code without modifying the kernel image. The approach is demonstrated as a kernel-mode rootkit on both the ARM and Intel x86-64 architectures that intercepts system calls while evading current kernel patch protection implementations such as PatchGuard. A proof-of-concept Android rootkit targets ARM (Krait) chipsets found in millions of smartphones worldwide, and a similar Windows rootkit targets the Intel x86-64 architecture. The prototype PMU-assisted rootkit adds minimal overhead to Android and less than 10% overhead to Windows. Further analysis of performance counters also reveals that the PMU can be used to trap returns from the secure world on ARM as well as returns from System Management Mode on x86-64.
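The general idea of a counter that raises an interrupt when it overflows can be illustrated with a toy software model. This is only a sketch under stated assumptions and not the paper's implementation: the class `ToyCounter`, the function `run_demo`, the 32-bit width, and the choice to preload the counter so that a single event wraps it are all hypothetical and chosen for illustration. No hardware registers are touched.

```python
from typing import Callable, List


class ToyCounter:
    """Software model of one overflow-interrupting event counter (hypothetical)."""

    def __init__(self, width_bits: int, on_overflow: Callable[[], None]) -> None:
        self.mask = (1 << width_bits) - 1
        self.value = 0
        self.on_overflow = on_overflow

    def arm(self, events_until_trap: int) -> None:
        # Preload so that exactly `events_until_trap` counted events wrap the counter.
        self.value = (self.mask + 1 - events_until_trap) & self.mask

    def count_event(self) -> None:
        # Increment modulo 2**width; wrapping to zero models the overflow interrupt.
        self.value = (self.value + 1) & self.mask
        if self.value == 0:
            self.on_overflow()


def run_demo(num_events: int = 3) -> List[str]:
    """Return a log showing that every counted event triggers the handler."""
    log: List[str] = []
    counter = ToyCounter(32, on_overflow=lambda: None)  # real handler set below

    def handler() -> None:
        log.append("trap")
        counter.arm(1)  # re-arm so the next event overflows again

    counter.on_overflow = handler
    counter.arm(1)
    for i in range(num_events):
        log.append(f"event {i}")
        counter.count_event()
    return log
```

In this model, arming the counter one step below its maximum makes each counted event invoke the handler, which is the sense in which control transfer is driven by the counter and not by any change to the code being monitored.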
Similar resources
Can Hardware Performance Counters Produce Expected, Deterministic Results?
Experiments involving hardware performance counters would ideally have deterministic results when run in strictly controlled environments. In practice, counters that should be deterministic (such as retired instructions) show variation from run to run on the x86_64 architecture. This causes difficulties when undertaking certain performance-counter-related tasks, such as simulator validation and ...
Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations – Extended
Ideal hardware performance counters provide exact deterministic results. Real-world performance monitoring unit (PMU) implementations do not always live up to this ideal. Events that should be exact and deterministic (such as retired instructions) show run-to-run variation and overcount on x86_64 machines, even when run in strictly controlled environments. These effects are non-intuitive to cas...
Modelling the Performance of the Gaussian Chemistry Code on x86 Architectures
Gaussian is a widely used scientific code with application areas in chemistry, biochemistry and material sciences. To operate efficiently on modern architectures, Gaussian employs cache blocking in the generation and processing of the two-electron integrals that are used by many of its electronic structure methods. This study uses hardware performance counters to characterise the cache and memory...
Using PAPI for hardware performance monitoring on Linux systems
PAPI is a specification of a cross-platform interface to hardware performance counters on modern microprocessors. These counters exist as a small set of registers that count events, which are occurrences of specific signals related to a processor's function. Monitoring these events has a variety of uses in application performance analysis and tuning. The PAPI specification consists of both a st...
A Detailed Analysis of Contemporary ARM and x86 Architectures
RISC vs. CISC wars raged in the 1980s when chip area and processor design complexity were the primary constraints and desktops and servers exclusively dominated the computing landscape. Today, energy and power are the primary design constraints and the computing landscape is significantly different: growth in tablets and smartphones running ARM (a RISC ISA) is surpassing that of desktops and la...